Grammar Extraction and Refinement from an HPSG Corpus
Author
Abstract
Learning and refining a grammar from language resources is very appealing compared with developing a formal grammar by hand. However, learning a complex grammar requires a correspondingly complex resource, so the creation of language resources and the learning of grammars from them must be designed with each other in mind. In this paper we define a formal basis for annotating corpora with respect to a contemporary linguistic theory, HPSG, together with a methodology for extracting grammars from such corpora. We also describe an approach to incrementally refining an HPSG grammar while an HPSG corpus is being annotated.
Similar resources
Exploring HPSG-based Treebanks for Probabilistic Parsing HPSG grammar extraction
We describe a method for the automatic extraction of a Stochastic Lexicalized Tree Insertion Grammar from a linguistically rich HPSG Treebank. The extraction method is strongly guided by HPSG-based head and argument decomposition rules. The tree anchors correspond to lexical labels encoding fine-grained information. The approach has been tested with a German corpus achieving a labeled recall of...
Extracting Supertags from HPSG-based Tree Banks
We describe a method for the automatic extraction of a Stochastic Lexicalized Tree Insertion Grammar from a linguistically rich HPSG Treebank. The extraction method is strongly guided by HPSG-based head and argument decomposition rules. The tree anchors correspond to lexical labels encoding fine-grained information. The approach has been tested with a German corpus achieving a labeled recall of...
Exploring HPSG-based Treebanks for Probabilistic Parsing
We describe a method for the automatic extraction of a Stochastic Lexicalized Tree Insertion Grammar from a linguistically rich HPSG Treebank. The extraction method is strongly guided by HPSG-based head and argument decomposition rules. The tree anchors correspond to lexical labels encoding fine-grained information. The approach has been tested with a German corpus achieving a labeled recall of...
Corpus-Oriented Development of Japanese HPSG Parsers
This paper reports the corpus-oriented development of a wide-coverage Japanese HPSG parser. We first created an HPSG treebank from the EDR corpus by using heuristic conversion rules, and then extracted lexical entries from the treebank. The grammar developed using this method attained wide coverage that could hardly be obtained by conventional manual development. We also trained a statistical p...
Unsupervised Lexicon Acquisition for HPSG-Based Relation Extraction
The paper describes a method of relation extraction, which is based on parsing the input text using a combination of a generic HPSG-based grammar and a highly focused domain- and relation-specific lexicon. We also show a method of unsupervised acquisition of such a lexicon from a large unlabeled corpus. Together, the methods introduce a novel approach to the "Open IE" task, which is superior...
Journal title:
Volume, issue
Pages -
Publication date: 2002